NSF PAR Search | NSF Public Access Repository


SC-Bench: A Large-Scale Dataset for Smart Contract Auditing

https://doi.org/10.1109/LLM4Code66737.2025.00012

Xia, Shihao; He, Mengting; Song, Linhai; Zhang, Yiying (May 2025, IEEE)

There is a huge demand to ensure that smart contracts listed on blockchain platforms comply with safety and economic standards described in natural language. Today, this goal is commonly achieved through manual auditing. ML-based automated techniques promise to reduce the human effort and the resulting monetary cost. However, unlike other domains where ML techniques have had huge successes, no systematic ML techniques have been proposed or applied to smart contract auditing. We present SC-Bench, the first dataset for automated smart-contract auditing research. SC-Bench consists of 5,377 real-world smart contracts running on Ethereum, a widely used blockchain platform, and 15,975 violations of Ethereum standards called ERCs. Of these violations, 139 are real violations made by programmers. The remaining ones are errors we systematically injected to reflect violations of different ERC rules. We evaluate SC-Bench using GPT-4 by prompting it with both the contracts and the ERC rules. In addition, we manually identify each violated rule and the corresponding code site (i.e., the oracle) and prompt GPT-4 with this information as a true-or-false question. Our results show that without the oracle, GPT-4 detects only 0.9% of the violations, and with the oracle, it detects 22.9% of them. These results show the room for improvement in ML-based techniques for smart-contract auditing.
Full Text Available
Rust-lancet: Automated Ownership-Rule-Violation Fixing with Behavior Preservation

https://doi.org/10.1145/3597503.3639103

Yang, Wenzhang; Song, Linhai; Xue, Yinxing (April 2024, Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (ICSE'2024))
How to Save My Gas Fees: Understanding and Detecting Real-World Gas Issues in Solidity Programs

https://doi.org/10.1109/TSE.2025.3593930

He, Mengting; Xia, Shihao; Qin, Boqin; Yoshida, Nobuko; Yu, Tingting; Zhang, Yiying; Song, Linhai (September 2025, IEEE Transactions on Software Engineering)

The execution of smart contracts on Ethereum, a public blockchain system, incurs a fee, called a gas fee, for computation and data storage. When programmers develop smart contracts (e.g., in the Solidity programming language), they may unknowingly write code snippets that incur unnecessary gas fees. These issues, which we call gas wastes, can lead to significant monetary losses for users. This paper takes the initiative in helping Ethereum users reduce their gas fees in two key steps. First, we conduct an empirical study on gas wastes in open-source Solidity programs and Ethereum transaction traces. Second, to validate our study findings, we develop a static tool called PeCatch to effectively detect gas wastes in Solidity programs, and manually examine the Solidity compiler’s code to pinpoint implementation errors causing gas wastes. Overall, we offer 11 insights and four suggestions, which can foster future tool development and programmer awareness, and fixing our detected bugs can save $0.76 million in gas fees daily.
Full Text Available
Understanding and Detecting Real-World Safety Issues in Rust

https://doi.org/10.1109/TSE.2024.3380393

Qin, Boqin; Chen, Yilun; Liu, Haopeng; Zhang, Hua; Wen, Qiaoyan; Song, Linhai; Zhang, Yiying (January 2024, IEEE Transactions on Software Engineering)

Full Text Available
Learning and Programming Challenges of Rust: A Mixed-Methods Study

https://doi.org/10.1145/3510003.3510164

Zhu, Shuofei; Zhang, Ziyi; Qin, Boqin; Xiong, Aiping; Song, Linhai (May 2022, Proceedings of the 44th International Conference on Software Engineering)

Rust is a young systems programming language designed to provide both the safety guarantees of high-level languages and the execution performance of low-level languages. To achieve this design goal, Rust provides a suite of safety rules and checks against those rules at the compile time to eliminate many memory-safety and thread-safety issues. Due to its safety and performance, Rust’s popularity has increased significantly in recent years, and it has already been adopted to build many safety-critical software systems. It is critical to understand the learning and programming challenges imposed by Rust’s safety rules. For this purpose, we first conducted an empirical study through close, manual inspection of 100 Rust-related Stack Overflow questions. We sought to understand (1) what safety rules are challenging to learn and program with, (2) under which contexts a safety rule becomes more difficult to apply, and (3) whether the Rust compiler is sufficiently helpful in debugging safety-rule violations. We then performed an online survey with 101 Rust programmers to validate the findings of the empirical study. We invited participants to evaluate program variants that differ from each other, either in terms of violated safety rules or the code constructs involved in the violation, and compared the participants’ performance on the variants. Our mixed-methods investigation revealed a range of consistent findings that can benefit Rust learners, practitioners, and language designers.
Full Text Available
Who Goes First? Detecting Go Concurrency Bugs via Message Reordering

https://doi.org/10.1145/3503222.3507753

Liu, Ziheng; Xia, Shihao; Liang, Yu; Song, Linhai; Hu, Hong (February 2022, Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)

Go is a young programming language invented to build safe and efficient concurrent programs. It provides goroutines as lightweight threads and channels for inter-goroutine communication. Programmers are encouraged to explicitly pass messages through channels to connect goroutines, with the purpose of reducing the chance of making programming mistakes and introducing concurrency bugs. Go is one of the most beloved programming languages and has already been used to build many critical infrastructure software systems in the data-center environment. However, a recent study shows that channel-related concurrency bugs are still common in Go programs, severely hurting the reliability of the programs. This paper presents GFuzz, a dynamic detector that can effectively pinpoint channel-related concurrency bugs by mutating the processing orders of concurrent messages. We build GFuzz in three steps. We first adopt an effective approach to identify concurrent messages and transform a program to process those messages in any given order. We then take a fuzzing approach to generate new processing orders by mutating exercised ones and rely on execution feedback to prioritize orders close to triggering bugs. Finally, we design a runtime sanitizer to capture triggered bugs that are missed by the Go runtime. We evaluate GFuzz on seven popular Go software systems, including Docker, Kubernetes, and gRPC. GFuzz finds 184 previously unknown bugs and reports a negligible number of false positives. Programmers have already confirmed 124 reports as real bugs and fixed 67 of them based on our reporting. A careful inspection of the detected concurrency bugs from gRPC shows the effectiveness of each component of GFuzz and confirms the components' rationality.
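The idea of mutating message-processing orders can be illustrated with a minimal Go sketch. This is not GFuzz's implementation: the helper names `recvInOrder`, `mutate`, and `run`, and the toy "data before close" bug, are hypothetical and chosen only to show how forcing a different processing order for concurrently pending messages can expose an order-dependent failure.

```go
package main

import "fmt"

// recvInOrder receives one message from each channel, visiting the channels
// in the caller-chosen order rather than letting the runtime decide.
// (Hypothetical helper, for illustration only.)
func recvInOrder(chans []chan string, order []int) []string {
	out := make([]string, 0, len(order))
	for _, i := range order {
		out = append(out, <-chans[i])
	}
	return out
}

// mutate returns a copy of order with positions i and j swapped,
// standing in for a fuzzer's mutation of a previously exercised order.
func mutate(order []int, i, j int) []int {
	next := append([]int(nil), order...)
	next[i], next[j] = next[j], next[i]
	return next
}

// run simulates a worker with two pending messages. It reports false
// when "data" is processed after "close", the toy order-dependent bug.
func run(order []int) bool {
	chans := []chan string{make(chan string, 1), make(chan string, 1)}
	chans[0] <- "data"
	chans[1] <- "close"

	closed := false
	for _, msg := range recvInOrder(chans, order) {
		if msg == "close" {
			closed = true
		}
		if msg == "data" && closed {
			return false // data arrived after the worker was closed
		}
	}
	return true
}

func main() {
	seed := []int{0, 1}
	fmt.Println(run(seed))               // true: data is handled before close
	fmt.Println(run(mutate(seed, 0, 1))) // false: the swapped order hits the bug
}
```

In this sketch the order is enforced simply by receiving from channels sequentially; a real detector would also need execution feedback to decide which mutated orders to try next, as the abstract describes.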
Full Text Available
Beyond Bot Detection: Combating Fraudulent Online Survey Takers

https://doi.org/10.1145/3485447.3512230

Zhang, Ziyi; Zhu, Shuofei; Mink, Jaron; Xiong, Aiping; Song, Linhai; Wang, Gang (April 2022, Proceedings of the Web Conference 2022)

Different techniques have been recommended to detect fraudulent responses in online surveys, but little research has been undertaken to systematically test the extent to which they actually work in practice. In this paper, we conduct an empirical evaluation of 22 anti-fraud tests in two complementary online surveys. The first survey recruits Rust programmers on public online forums and social media networks. We find that fraudulent respondents exhibit both bot and human characteristics. Among the different anti-fraud tests, those designed based on domain knowledge are the most effective. By combining individual tests, we can achieve detection performance as good as that of commercial techniques while making the results more explainable. To explore these tests in a broader context, we ran a different survey on Amazon Mechanical Turk (MTurk). The results show that for a generic survey that does not require users to have any domain knowledge, it is more difficult to distinguish fraudulent responses. However, a subset of the tests still remains effective.
Full Text Available
Automatically Detecting and Fixing Concurrency Bugs in Go Software Systems

https://doi.org/10.1145/3445814.3446756

Liu, Ziheng; Zhu, Shuofei; Qin, Boqin; Chen, Hao; Song, Linhai (April 2021, Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems)
Full Text Available
VRLifeTime -- An IDE Tool to Avoid Concurrency and Memory Bugs in Rust

https://doi.org/10.1145/3372297.3420024

Zhang, Ziyi; Qin, Boqin; Chen, Yilun; Song, Linhai; Zhang, Yiying (October 2020, Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security)
Full Text Available
Measuring and Modeling the Label Dynamics of Online Anti-Malware Engines

https://www.usenix.org/conference/usenixsecurity20/presentation/zhu

Zhu, Shuofei; Shi, Jianjun; Yang, Limin; Qin, Boqin; Zhang, Ziyi; Song, Linhai; Wang, Gang (August 2020, The 29th USENIX Security Symposium (USENIX Security))

Full Text Available
